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ABSTRACT 

Earlier papers introduced a method of accurately estimating the angular cosmic microwave back- 
ground (CMB) temperature power spectrum based on Gibbs sampling. Here we extend this framework 
to polarized data. All advantages of the Gibbs sampler still apply, and exact analysis of mega-pixel 
polarized data sets is thus feasible. These advantages may be even more important for polarization 
measurements than for temperature measurements. While approximate methods can alias power from 
the larger E-mode spectrum into the weaker B-mode spectrum, the Gibbs sampler (or equivalently, 
exact likelihood evaluations) allows for a statistically optimal separation of these modes in terms of 
power spectra. To demonstrate the method, we analyze two simulated data sets: 1) a hypothetical fu- 
ture CMBPol mission, with the focus on B-mode estimation; and 2) a Planck-like mission, to highlight 
the computational feasibility of the method. 

Subject headings: cosmic microwave background — cosmology: observations — methods: numerical 



1. INTRODUCTION 

Since the first detection of cosmic microwave back- 
ground (CMB) polarization (DASI; Kovac et al. 2002, 
Leitch et al. 2002) and subsequent measurement of 
the temperature-gradient (TE) cross-power spectrum by 
the Wilkinson Microwave Anisotropy Probe (WMAP; 
Kogut et al. 2003), emphasis has shifted to the mea- 
surement and analysis of the full polarization angular 
power spectra. Many experiments (Leitch ct al. 200^ 
l^adhead et al. 2004; Barkats ct al. 2005; Mo ntrov et aJj 
l2005ic iPage et al.i i2006) have improved on those early 
findings, producing measurements with a considerable 
gain in raw sensitivity. However, an important con- 
cern with all such measurements is systematic errors, 
including not only instrumental effects, observing strat- 
egy effects, and astrophysical contaminants, but also sta- 
tistical issues. It is essential to develop powerful and 
flexible data analysis tools to extract the desired infor- 
mation from the raw data reliably. In this paper we 
progress towards this goal by extendin g the previously in- 
troduced Gibbs sarnpling framework llJewell et alJl2004l: 



iWandelt et alJl2^lEriksen et al.ll200^ to polarization. 

The scientific importance of CMB polarization power 
spectra is high. For example, our current understanding 
of the optical depth, amplitude, and scalar spectral index 
hinges on what we know about the magnitude of the 
the low-£ temperatu re and polarizatio n spectra from the 
WMAP 3-year data l|Page et al.ll2006j) . Also, a detection 
of large scale B modes would give a very exciting insight 
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into primordial gravitational waves. 

Earlier Gibbs analyses of unpolari z ed CMB data 
were described bv IWandelt et all l|2004tl ; lO'Dwver et alJ 
(2004); Erikscn 'et alJ l|2004 120061) . These efforts demon- 
strated that exact analyses are indeed feasible even for 
such large data sets as the WMAP data, which comprise 
several million pixels. This is possible due to the very fa- 
vorable scaling of the Gibbs sampling algorithm. While 
brute- force likelihood evaluations scale as 0{Np^^), Npi^ 
being the number of pixels in the data set, the Gibbs 
sampler scales identically to the map making operation. 
For the special case of uncorrelated noise and symmet- 
ric beams, this reduces further to 0{N^(^). Thus, even 
Planck-sized data may be analyzed using these tools, as 
will be demonstrated in the present paper. 

Gibbs sampling thus provides an efficient route to the 
exact posterior (or likelihood). Moreover, it does not 
rely on any ad-hoc approximations. Even for the analysis 
of temperature data, this proved to be both an impor- 
tant and subtle issue llSnergel et al.l f2006: Eri ksen et alJ 
l200(il) . However, it is critical for polarization measure- 
ments, because well-known approximate methods such 
as the pseudo-Cf methods (e.g., Chon et al. 2004) can 
lead to aliasing of E-mode power into the much smaller 
B-mode power spectrum. Although i t is possible to con- 
struct ways around this problem (^Smithl |2005(1 . exact 
methods such as full likelihood evaluations or Gibbs sam- 
pling are clearly preferable solutions. 

We start by discussing the algorithms used for po- 
larized Gibbs sampling, extending the signal and power 
spectrum sampling steps from temperature to polariza- 
tion. Then we analyze simulated data to verify that the 
algorithm works and to determine the computational ef- 
ficiency of the method. 
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2. ALGORITHMS 
2.1. Overview of Gibbs Sampling 

Gibbs sampling in the polarization case is essentially 
the same as in the temperature case, with objects in- 
volved in the sampling re-defined to account for the ad- 
ditional information. For full details on the method- 
ology of Gibbs sam pling as applied to C MB analy- 
sis. see lJewell e t al. (2 004)) : iWandelt et all 1)20041) : and 
lEriksen et a'Lri|2n04) . 

Specifically, the CMB signal is generalized to a vector 
of harmonics coefficients {ajjyi, af^, af^) for each £ and 
m, where the letters T, E, and B stand for temperature, 
electric/gradient, and magnetic/curl respectively. The 
covariance matrix S of the CMB signal then becomes 
block-diagonal, with an identical 3x3 sub-matrix for 
each m value at a given £: 
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The data are pixelized maps m of the Stoke's parameters 
/, Q, U of the form 

m = As + n, (2) 

where A is a linear operator that includes convolution 
with an instrument beam and the transformation of the 
T, E, B components of the signal s into the Stokes pa- 
rameters. Note that for the rest of this paper, we will 
assume both the instrumental beam to be symmetric and 
the noise n to be uncorrelated, having a diagonal covari- 
ance matrix N. These are the reasons we can work with 
maps instead of time-ordered data. However, to simplify 
the notation we disregard in the following all issues con- 
cerning data format, beam convolutions, multi-frequency 
observations etc., and model our data as a simple sum of 
a signal term and a noise term. For the full expressions, 
see Appendix E 

Application of a galactic mask is implemented by in- 
creasing the noise variance to infinity for masked pix- 
els, or rather, by setting the inverse noise covariance to 
zero. For full details, we refer the interested reader to 
lEriksen et al. (2004) . 

As in t h e te mper ature-only cas e disc ussed in 
iJewell et all ()2004D and IWandelt et alJ l|2004D . we wish 
to sample from the P(S|d) posterior. It is typically not 
easy to evaluate P(S|d) directly, because of a large and 
dense (S -I- N) covariance matrix, nor is it easy to sam- 
ple from it directly. This is precisely the motivation 
for Gibbs sampling, which allows sampling from a joint 
density through the corresponding conditional densities. 
For the case of CMB power spectrum estimation, this is 
done by first sampling from P(S, s|d) using P(S|s, d) and 
P(s|S,d), (neither of which requires inversion of dense 
(S -I- N) matrices), and then marginalizing over s. Using 
the fact that, given a full-sky signal map the conditional 
density for the signal matrix is independent of the data 
P(S|s,d) = P(S|s), the basic Gibbs sampling scheme 
may be written in the following form, 

S'+i^p(S|s\d) (3) 

s'+i^P(s|S'+i). (4) 

Here the symbol ^ indicates sampling from the distribu- 
tion on the right hand side. The only remaining problem 



is to establish the correct sampling algorithms for each 
of the two conditional distributions for polarized data, 
and this is the topic of the following sections. 

Note that if a continuous distribution for P(S|d) is de- 
sired, as opposed to a set of individual samples, one may 
take advantage of the known analytical form of the distri- 
bution P(S|s) by applying the Blackwell- Rao estimator 
This proced ure was discussed in detail bv IWandelt et alJ 
( 2004) and iChu et al. ( 2005) for the temperature-only 
case, and the generalization to polarization is once again 
straightforward. The required modifications are written 
out in Section 

2.2. Signal Sampling 

The signal sampling equations for polarization are 
identical to those for temperature-only data, taking into 
account the generalizations mentioned above. Specifi- 
cally, the sky signal (s = x -I- y) is sampled (given the 
current covariance matrix S) by solving for the mean 
field, X, and fluctuation, y, maps 



1 + S1/2N-1s1/2 S-i/2x = 5i/2N-im, (5) 
1 + S1/2n-1s1/2 S-i/2y = e + S^/^N-i/^x, (6) 



where ^ and x ^'^^ random maps containing Gaussian 
unit variates (zero mean and unit variance) in each pixel 
for each of the /, Q, and U components®. Note that the 
symbols in these equations may be interpreted either in 
terms of pixel space or spherical harmonic space objects. 
In practice, this is implemented in terms of conversions 
between pixel and harmonic space with standard spheri- 
cal harmonics transforms. For example, the inverse noise 
covariance matrix is given by in pixel space and 

Y^N^^Y in harmonic space, where Y and Y-^ are the 
inverse and standard spherical harmonics transforms, re- 
spectively. Fo r expl icit details on such computations, see 
lEriksen et all (p006). 

The signal sampling operation is by far the most de- 
manding step of the Gibbs sampler, because it requires 
the solution of a very large linear system. Formally 
speaking, this corresponds to inverting a ^ 10® x 10® 
matrix, which clearly is not computationally feasible 
through brute -force methods. How ever, as described in 
detail by, e.g. . lEriksen et al.1 ()2004)) . the systems in equa- 
tions O and |5l may be solved by means of Conjugate Gra- 
dients (CG). The computational scaling is thus reduced 
to the most expensive step for applying the operator on 
the left hand side of the equations, which for symmet- 
ric beams and uncorrelated noise is a standard spherical 
harmonic transform. 

The efficiency of the CG technique depends critically 
on the condition number of the matrix under considera- 
tion. For our case, this is simply the highest signal-to- 
noise ratio of any mode in the system. As an example, 
for a fixed pre-conditioner it takes about 60 iterations 
to solve for the first-year WMAP data, about 120 itera- 
tions to solve for the three-year WMAP data, and about 
300 iterations to solve for the Planck 100 GHz data. 

^ Note that S^/^ and S"^/^ must be symmetric for these equa- 
tions to be valid. On the other hand, N~^/^ only has to satisfy 
N-l/2(N~l/2)^ = N"^, and may be chosen to be the Cholesky 
decomposition. 
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Fig. 1. — Gibbs sampled signal maps. The three columns show, from left to right, temperature, Stoke's Q and Stoke's U parameters. 
The three rows show, from top to bottom, the complete Gibbs samples, the mean field (Wiener filtered) maps, and the fluctuation maps. 
The mean field map provides the information content of the data, and the fluctuation map provides a random complement such that the 
sum of the two is a full-sky, noiseless sky consistent both with the the current power spectrum and the data. 



This is a particularly serious issue for CMB polariza- 
tion measurements. While these signatures by them- 
selves have a very low signal-to-noise ratio, and there- 
fore should be easy to determine on their own, the corre- 
sponding signal-to-noise ratio for temperature is tremen- 
dous. Consequently, if a main goal is to estimate the TE 
cross-spectrum, by far most of the CPU time is spent on 
temperature map convergence. On the other hand, if all 
interest lies in E- and B-modes, the temperature data 
may be disregarded completely (or alternatively condi- 
tioned on by sampling from P{af^,af^\d,ajj^)), and 
convergence is then achieved rapidly even for CMBPol 
type missions. This will be explicitly demonstrated in 
Section im 

It is possible to reduce the computational expense of 
a CG search significantly by pre-conditioning. One ap- 
proach that has proved successful so far is to pre-compute 
a subset of the coefficient matrix in equations [S] and El 
and multiply both sides of the equations by the inverted 
sub-matrix. Thus, by inverting the most problematic 
parts of the matrix by hand, the effective condition num- 
ber is greatly reduced, and significant speed-up may be 
achieved. 

Currently, our pre-conditioner is constructed indepen- 
dently for the temperature and polarization states. For 
the polarization components, it is a diagonal matrix in 
EE and BB independently, while for the TT correlations, 
it consists of a \ow-£ matrix that includes all coefhcients 
up to som e ^mav, and t hen the diagonal elements at 
higher ^'s ijEriksen et al.ll2004j) . For WMAP-type ap- 
plications, we typically used £max = 50, which requires 



52 MB of memory and about 1 minute of CPU time for 
inversion. For upcoming Planck data, it will be desirable 
to use a significantly larger pre-conditioner, and more 
realistic numbers are £max ^ 150 or 200. This will re- 
quire extensive parallelization, and has not yet been im- 
plemented in our codes. We therefore still use a serial 
pre-conditioner up to ^max = 70 in this paper, and pay 
the extra cost in CG iterations. 

2.3. Power Spectrum Sampling 

Given the (full sky) signal polarization map sampled 
from P(s|S,(i) as described above, we must sample the 
signal covariance matrix from P(S|s), which is explicitly 
given by 

P(S|s)^TT^ ^ expf-ltra,C7^y (7) 

Here we have assumed a prior of the form P(S) oc 
Y[i \ Ci\~'' (i.e., g = for a uniform prior, and q = 1 
for a Jeffreys prior), and we have defined 

e 

Cre^ Sirnsl^. (8) 

m=-l 

Each Sim represents the signal in harmonic space, and is 
a three dimensional complex-valued column vector con- 
taining the coefficients for the T-, E-, and B-modes at 
that £ and m. This d istribution is known a s the inverse 
Wishart distribution ijGupta fc N agar 2GO0fl. 

Sampling from this conditional density can be done 
with a vector generalization of the sampling algorithm 
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Fig. 2. — Close-up of the galactic center shown in Figure emphasizing how the algorithm separates E- and B-modes. Each of the 
sampled maps (the sum of the fluctuation and mean field map) are full sky maps, so decomposing the polarization into E- and B-modes 
is straightforward. The images show temperature as color and polarization overlayed as a fingerprint pattern of stripes. The stripes are 
aligned with the directio n of polarization. They are darkest where the polarization is strongest, and they disappear where the polarization 
goes to zero ICabral and Leedomilil993i) . The maximum amplitude of the polarization is given in fiK and centered under each image. The 
maps have been smoothed to 1 degree. This is an orthogonal projection of the sky, about 60 degrees wide, centered on the Galactic center. 
The WMAP KpO galactic mask is visible in the fiuctuation and mean terms. 



described in lWandelt et all l|2004D . If p X p is the size of 
the matrix being sampled (typically p — 3 for polariza- 
tion), then the required steps for sampling are: 1) sample 
n = 2£ — p + 2q vectors from a Gaussian with covari- 
ance matrix cr^; 2) compute the sum of outer products 
of these independently sampled vectors; and 3) invert 
this matrix. For full details on both the inverse Wishart 
distribution and the sampling algorithm, we refer the in- 
terested reader to chapter 3 offGuota & Naaar (2000). 

There is a caveat for i = 2. The Wishart distribution, 
from which we derive our sampling algorithm, is defined 
only if n > p; if not, the sampled matrix is singular. This 
is a problem for £ = 2 and a flat prior, since we would 
only sample one vector to form a 3 x 3 matrix. Thus, 
the algorithm breaks down for this particular case. For- 
tunately, this is not a major problem in practice. Three 
straightforward solutions are: 1) sample the 2x2 TE 
block and the B block of the matrix separately, assum- 
ing no TB or EB correlations; 2) use a Jeffrey's prior 
{q — 1); or 3) bin the quadrupole and octopole together. 



Note that all other multipoles may be sampled individ- 
ually by the above algorithm without modifications. 

Binning — As discussed bv lEriksen et alJ l)200(iD . it is 
highly desirable for the Gibbs sampler to be able to bin 
several power spectrum multipoles together. The main 
advantage of this is improved sampling efficiency: As 
currently implemented, the step size taken between two 
consecutive Gibbs samples is given by cosmic variance 
alone. The full posterior, however, is given by both cos- 
mic variance and noise. Therefore, in the low signal- 
to-noise regime, one must take a larger number of steps 
to obtain two independent samples. The easiest way of 
improving on this is simply to bin many multipoles to- 
gether, and thereby increase the signal-to-noise ratio of 
the power spectrum coefficient. In practice, we choose 
bins such that the signal-to-noise ratio is always larger 
than some limit, say 3. 

Since the CMB power spectrum is roughly propor- 
tional to l/i{£+ 1), it is convenient to define uniform 
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bins in Cd{e + 1). 



We therefore redefine for bin 



Note that there are now 



M 



1) 



mm 



(9) 



(10) 



independent spherical harmonic modes contributing to 
this power spectrum coefficient. Thus, the inverse 
Wishart distribution has n = M— p— \ + 2q degrees of 
freedom rather than n = 2£—p + 2q. With this modifica- 
tion, the basic sampling algorithm remains unchanged, 
but since we have sampled Cf, = £{£ + l)Cg and not 
C^, the actual power spectrum coefficients are given by 
Ce = Cb/e{i + 1) for each £ in bin b. 

2.4. Separation of E- and B-modes 

We now make a brief comment on the so-called E-B 
coupling problem that plagues most approximate meth- 
ods, such as the pseudo-C^ methods (see, e.g.. Smith 
2005). Briefly put, the problem lies in the fact that the 
spherical harmonics are not orthogonal on a cut sky, and 
this may result in leakage from the (much larger) E-mode 
power into the B-mode power spectrum. 

Exact methods such as exact likelihood analyses or 
Gibbs sampling do not have this problem. This may 
be understood intuitively in terms of the signal sampling 
process illustrated in Figures ^ and El Obtaining a com- 
plete sky sample for the Gibbs sampler is a two step 
process. First, one filters out as much information as 
possible from the observed data using a Wiener filter. 
Second, one replaces the lost power due to noise and 
partial sky coverage by a random fluctuation term. The 
sum of the two is a full-sky, noiseless sample that is con- 
sistent with the data. Because it is a full-sky sample, no 
E-B coupling arises. 

2.5. Blackwell-Rao Estimator 

The Gibbs sampler provides a set of samples of the sig- 
nal covariance matrix S. In practice, it is often preferable 
to have a smooth description of the probability density 
of S. In such cases, one can use the Blackwell-Rao es- 
timator, which takes advantage of the known analytical 
form of the probability distribution P(S|s) and uses the 
set of signal samples {s} — {s^, . . . , s*^} to approximate 
P(S|d). 

An intuitive understanding of the Blackwell-Rao esti- 
mator may be found in terms of the usual Gibbs sam- 
pling algorithm. Within the theory of Gibbs sampling 
(or more generally Markov Chain Monte Carlo), it is 
perfectly valid to sample one parameter more often than 
others, so long as the sampling scheme is independent 
of the current "state" of the Markov chain. In particu- 
lar, one may choose to sample S one thousand times for 
each time one samples s, and thereby obtain more power 
spectrum samples (although not sky signal samples) with 
negligible cost. The result is a smooth power spectrum 
histogram. The Blackwell-Rao estimator takes this idea 
to the extreme, and replaces the power spectrum sam- 
pling step by the corresponding analytical distribution. 



The result is a highly accurate and smooth description of 
P(S|d) that is ver y useful for, say, estimation of cosmo- 
logical parameters l< Wandelt et alJl2004l:rChu et al.ll2005t 
Erikscn et al. 2006). 

For full details on this estimator for the temperature- 
only case, we refer the interested reader to iChu et all 
(|i005). But again, the generalization to polarization 
is indeed straightforward, and the generalized estimator 
reads 



F(S|{s})(x5]n 



-2t+2q-p 



1 



where is the 3x3 sub-matrix of S for a given and 
j is the index over Gibbs samples. 

3. APPLICATION TO SIMULATED DATA 

We now apply the methodology described in Section [5] 
to simulated data. Two different cases are considered 
to highlight different features. In the first, we consider 
a low-resolution, high signal-to-noise ratio experiment 
aimed at detecting primordial B modes. The main goal 
of this exercise is to demonstrate the fact that the so- 
called E/B coupling problem that plagues approximate 
methods is not an issue for exact methods. Second, we 
consider a high-resolution simulation based on the Planck 
100 GHz channel to demonstrate that Gibbs sampling is 
feasible even for very large CMB data sets. 

3.1. Low-resolution B-mode experiment (CMBPol) 

Our first case corresponds to a possible future mission 
targeting the primordial B-modes that arise during the 
inflationary period. Such modes are expected to have a 
very low amplitude and to be limited to large angular 
scales. Some case studies for a B-mode mission therefore 
emphasize extreme sensitivity over angular resolution, 
and we adopt similar characteristics for this exercise. 

As discussed in Section the convergence ratio for 
the conjugate gradient search depends critically on the 
signal-to-noise ratio of the data. In order to achieve ac- 
ceptable performance when analyzing temperature ob- 
servations with the sensitivity required for detecting B- 
modes, a much better pre-conditioner than what we have 
currently implemented is required. We therefore only 
consider the E- and B-mode spectra here, and not the 
temperature spectrum. 

The simulated data set consists of the sum of a CMB 
component and a white noise component. The CMB real- 
ization was drawn in harmonic space from a Gaussian dis- 
tribution with a ACDM spectrum (downloaded from the 
WMAP3 parameter table at LAMBDA) having a tensor 
contribution of r ~ 0.03. Multipoles up to ^max — 512 
were included. This realization was then convolved with 
a 1° FWHM Gaussian beam and A'sido = 256 pixel win- 
dow, and projected onto a HEALPix^^ grid. Next, uni- 
form (and uncorrelated between Q and U) noise of 1 /iK 
rms was added to each pixe l. Finally, the WMAP3 po- 
larization mask ijPage et alJ l2006) was applied, removing 
26.5% of the sky from the analysis. 

We adopted a binning scheme logarithmic in such 
that hi = [2',2'+i - 1]. Note that this is not directly 

http://hcalpix.jpl.nasa.gov/ 
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Fig. 3. — Reconstructed E- and B-mode power spectra from 
the low-resolution analysis. Input spectra are shown as dashed 
and dotted lines, respectively, while the reconstructed posterior 
distributions are indicated by solid curves (posterior maximum) 
and gray regions (one and two sigma confidence regions). The 
corresponding noise spectrum is given by a thin dashed line. The 
Gelman-Rubin convergence statistic as a function of multipole is 
shown in the bottom panel. 
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connected to the signal-to-noise ratio of the data them- 
selves, and this will have consequences for the conver- 
gence properties of the high-^ B-mode bins. However, 
our main focus in this paper is the method itself, and 
this scheme is chosen to illustrate the effect of both high 
and low signal-to-noise binning, not to obtain an optimal 
power spectrum. 

The simulation was then analyzed with the Gibbs sam- 
pler described earlier, producing 1000 sky samples in 
each of five independent Markov chains. The CPU cost 
for producing one sample was 10 minutes, or a wall clock 



time of 2.5 minutes when parallelized over four proces- 
sors. The total running time was thus 42 hours using 
20 processors. For each sky sample, 20 independent 
power spectrum samples were drawn in order to obtain 
smoother Ci confidence regions. (See the discussion of 
the Blackwell-Rao estimator in Section for more de- 
tails.) 

We first consider the reconstructed auto-spectra, which 
are shown in the top panel of Figure 13 The input (un- 
binned) spectra are given by dashed and dotted lines 
for E- and B-modes, respectively, and the reconstructed 
(binned) posterior maximum spectra are shown by solid 
black lines. One and two sigma confidence regions are 
marked by gray regions. Finally, the beam deconvolved 
noise spectrum is indicated by a thin dashed line. 

In the bottom p anel, we show the Gelm an-Rubin con- 
vergence statistic ijGelman &: Rubinlll992|) as computed 
from the ai sky signal power spectra for each £. This is 
much more conservative than computing the same statis- 
tic from the samples, for two reasons. First, conver- 
gence in the binned power spectrum is achieved faster 
than convergence in each sky mode. Second, cosmic vari- 
ance only contributes to the power spectrum and not the 
signal on the sky. Therefore, this may be accounted for 
either analytically through the Blackwell-Rao estimator 
or by re-sampling the spectra given the cr^'s. In other 
words, a small error in the sky signal variance does not 
affect the full posterior significantly if the desired distri- 
bution is anyway dominated by cosmic variance. 

A general recommendation is that the Gelman-Rubin 
statistic, i?, should be less than 1.1 or 1.2 to claim con- 
vergence, although the value depends on the particular 
application and initialization procedure, and should be 
compared against other methods such as jack-knife tests. 
However, for the particular case shown in Figure 13 it is 
clear that the E-mode spectrum has converged very well 
everywhere, while the B-mode spectrum only has con- 
verged up to ^ « 60. 

As discussed in Section 12.3.0.01 this behavior can be 
understood intuitively in terms of signal-to-noise ratio. 
Since the step size between two signal samples is given 
by cosmic variance alone, while the full posterior dis- 
tribution is given by both cosmic variance and noise, it 
takes a large number of Gibbs steps to diffuse efficiently 
in the very low signal-to-noise regime. Further, the noise 
spectrum is about three orders of magnitude larger than 
the B-mode spectrum at ^ > 100, and the Gibbs sampler 
is therefore unable to probe the full distribution with a 
reasonable number of samples. 

To resolve this issue, we bin the power spectrum. How- 
ever, the binning scheme was not tuned to obtain con- 
stant signal-to-noise in each bin, but was rather arbi- 
trary. The result is clearly seen in the Gelman-Rubin 
statistic: For < 60 the signal-to-noise per bin is high, 
and convergence is excellent. At > 60, it is low, and 
the convergence is very poor. The way to resolve this 
would have been to choose larger bins at higher t&. 

In Figure^ we show the ExB cross-spectrum. As 
expected, this is nicely centered on zero. 

We end this section by commenting on the applicabil- 
ity of this formalism to a possible future CMBPol type 
mission. As is well known, the main problems for such a 
mission will not be primarily statistical issues of the type 
discussed above, but rather systematics in various forms. 
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Fig. 5. — Reconstructed power spectra from the high-resolution Planck 100 GHz simulation. The true spectra are shown as dashed lines, 
and the reconstructed posterior distributions are given by a maximum posterior value (solid lines) and 68 and 95% confidence region. The 
Gelman-Rubin convergence statistics are shown in the bottom panels. 



Two important examples are correlated noise and asym- 
metric beams. However, if it is possible to pre-compute 
the complete A"^N~-'^A matrix for the data set under 
consideration, then these two important effects may be 
fully accounted for using the methods described here. 
And for a low-resolution CMBPol mission this may be 
possible. For an upper multipole limit of, say, ^max — 300 
there is a total of 2x90 000 180 000 polarized spher- 
ical modes to account for. In other words, one has to 
store and invert a 180 000 x 180 000 matrix in order to 
analyze such an experiment exactly. Although this is a 
considerable computational problem, it is quite tractable 
already with current computers. Thus, if it is possible to 
compute this matrix in the first place for a given exper- 
iment, an exact and complete analysis is feasible using 
the methods described in this paper. 



3.2. High-resolution T+E experiment (Planck) 

In order to demonstrate the feasibility of this method 
for analyzing even the largest planned data set, we now 
consider a simulation with properties similar to those of 
the Planck 100 GHz instrument. Specifically, the grid 
resolution is chosen to be Aside = 1024 (corresponding 
to a 3.4' pixel size), the maximum multipole moment is 
^sidc — 1500, the beam size is 9.5', and the noise level is 
38.2 RMS per pixel for temperature and 61 /xK per 
Q/U pixel. These noise levels are a factor of two higher 
than the goal levels for the 100 GHz channel, given in the 
Planck bluebook^-"^. No noise correlations between T, Q 
and U were included, the B-mode spectrum was set to 
zero, and the sky cut was chosen to be the WMAP Kp2 
mask. 

Such large data sets are certainly a challenge for the 
Gibbs sampling algorithm, and the computational re- 
quirements are considerable. Specifically, the CPU time 
for generating one sample (requiring ^ 250-300 CG it- 
erations) is about 16 CPU hours when using the low-£ 
preconditioner described by Eriksen et al. (2004) up to 
€ = 70. Better preconditioners will of course reduce this 
cost significantly. 

1 1 http: / /www.rssd.esa.int/SA /PLANCK /docs/Bluobook- 

ESA-SCI(2005) l.V2.pdf 



However, it is important to note that even though this 
is an expensive operation, it is by no means prohibitive. 
To obtain a reasonably well converged posterior distri- 
bution, one requires on the order of ~ 10'^ independent 
samples, and this would then require ^ 10'* CPU hours. 
Of course, this number must be multiplied with a signifi- 
cant factor for an actual production analysis (e.g, number 
of frequency bands or data combinations), but consider- 
ing the tremendous efforts spent on obtaining the Planck 
data in the first place, this amount of CPU time is a most 
reasonable cost for analyzing them. 

For the high-resolution analysis presented in this pa- 
per, we produced a total of 800 sky samples, divided 
over eight independent chains. Again, 20 independent 
power spectrum samples were then drawn from each of 
these for visualization purposes. The results from these 
computations are summarized in Figure |S1 showing both 
the reconstructed power spectra and the corresponding 
convergence statistics. 

With the chosen binning scheme and number of sam- 
ples, we see that the TT spectrum has converged well ev- 
erywhere, while the TE spectrum has some small prob- 
lems at the end of the second bin. The EE spectrum 
would clearly have benefited from more samples, and 
even more importantly, slightly larger bins; increasing 
the bin size by, say, 20% would have resolved both the 
TE and EE issues. 

However, as far as computational feasibility goes, the 
important part is the signal sampling step, and not bin- 
ning or re-sampling issues; these can always be adjusted 
given some crude knowledge of the data set under con- 
sideration. Therefore, the fact that already this first im- 
plementation of the polarized Gibbs sampler is able to 
produce hundreds of sky samples with only a few days on 
a standard computer cluster, is a direct demonstration of 
computational feasibility for even Planck-sized data sets. 

4. CONCLUSIONS 

This paper extends the Gibbs sampling technique to 
polarized power spectrum estimation. We have detailed 
the necessary generalization steps relative to the origi- 
nal temperature-only d escriptions given bv [Je well et alJ 
l|200^ . IWalldelt et all f2004) and lEriksen eTal. (2004D . 
and have considered computational aspects of polarized 
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analysis. 

The algorithm was demonstrated with two specific ex- 
amples. First, considering a possible CMBPol type mis- 
sion, we showed that the Gibbs sampler cleanly separates 
E- and B-modes, and no special care is required. This 
is in sharp contrast to approximate methods such as so- 
called the pseudo-Cf , for which great care must be taken 
in order for the larger E-modes not to compromise the 
minute B-modes. 

Second, we analyzed a Planck-sized data set, demon- 
strating that the algorithm is useful for analyzing the 
quantity of data which will come from near-future CMB 
experiments. 

The Gibbs sampling results presented here use sym- 
metric beams and noise which is uncorrelated between 
pixels. However, the Gibbs sampling algorithm has po- 
tential to analyze considerably more complicated data 
sets than these. For Planck, the solution lies in exploit- 
ing the very regular scanning strategy, which reduces the 
computational burden of a time-ordered data analysis. 
For a future CMBPol mission, the solution lies in the rel- 
atively large angular scales required. Since it is possible 
to invert the noise covariance matrix for multipoles up to 
several hundreds, one may pre-compute the all-important 
A-^N~^A matrix. After paying this high one-time cost, 
efficient and exact analysis is feasible using the methods 
described in this paper. 

Finally, we re-emphasize that the Gibbs sampler pro- 



vides a direct route to the exact likelihood (and to the 
Bayesian posterior), and it is much more reliable than ap- 
proximate methods. This issue has been demonstrated 
explicitly through the analysis of the three- year WMAP 
data, where an approximate likelihood between ^ = 13 
and 30 caused a non-negli gible bias in the spectral in- 
dex Us ijEriksen et al.l20'0q) . Using Gibbs sampling, such 
worries are greatly reduced. Further, this paper demon- 
strates that the method is in fact capable of analyzing the 
amount of data that will come from the Planck mission 
with reasonable computational resources. It therefore 
seems very likely that this method will play a significant 
role in the analysis of future Planck data. 
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APPENDIX 
SIGNAL SAMPLING 

Equations |S1 and have been written as simply as possible, for clarity when describing the extension of our Gibbs 
sampling algorithm to polarization. A more realistic treatment will involve multiple channels, symmetric beams, the 
pixel window function, and a cutoff at some value of £. In this appendix we write out, for reference, the log likelihood 
for s and the sampling equations that one derives from this. 

Let the index i run over channels. Let be the beam smoothing function, and W be the HEALPix pixel window 
smoothing function. If all channels are at the same resolution, then there is only one pixel window function; otherwise 
W will need an i index as well. Let P be a projection operator that removes all modes with £ above some cutoff. 
Note that P, W, and all commute, and P commutes with S. As before, m; are the maps and s is the signal. For 
generality, we also include a foreground component f^, which is not otherwise discussed in this paper. 

-21ogP(s|S,m„f„N„B„W) = s^S^'s + ^ (m, - B,Ws - B.Wf,)^ PN-'P(m, - B,Ws - B.Wf,) + const. (Al) 



From the above equation, it is clear that W can be absorbed into B^, so we do this and drop W from the equations. 
The equations for sampling s = x + y become: 



P + PSi/2 ^(B,PN-ipB,:)Si/2p 

i 

P + PS^/^ ^(BiPN"ipBi)Si/2p 



S-i/2px = PSi/2y B,PNrip(m, -B,f,) 



-1/2 



(A2) 



(A3) 



where now we have several maps Xi of Gaussian unit variates. Recall that these equations require the square root of 
S to be symmetric. 
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